Discovering Word Senses for Polysemous Words Using Feature Domain Similarity
Abstract
This paper presents DSCBC, a new clustering algorithm designed to automatically discover the senses of polysemous words. DSCBC extends CBC (Pantel and Lin, 2002) by incorporating feature domain similarity: the similarity between the features themselves, obtained a priori from sources external to the dataset. By incorporating feature domain similarity into clustering, DSCBC produces monosemous clusters (each cluster lying in a single domain), thereby discovering the individual senses of polysemous words. For evaluation, we apply the algorithm to Japanese and English adjectives and compare the derived senses against manually created lexicons. The results show significant improvements over other clustering algorithms, including CBC.
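The abstract does not give the formula DSCBC uses, so the following is only an illustrative sketch of the general idea of folding a priori feature-to-feature similarity into a similarity measure between words. It uses a soft cosine measure as a stand-in technique. The function name `soft_cosine`, the matrix `domain_sim`, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def soft_cosine(x, y, feature_sim):
    """Cosine similarity generalized with a feature-to-feature similarity matrix.

    x, y: equal-length lists of feature weights for two words.
    feature_sim: square matrix; feature_sim[i][j] is the a priori similarity
    between features i and j (1.0 on the diagonal). This is an assumed
    stand-in for "feature domain similarity", not the paper's definition.
    """
    def bilinear(a, b):
        return sum(a[i] * feature_sim[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))

    denom = math.sqrt(bilinear(x, x)) * math.sqrt(bilinear(y, y))
    return bilinear(x, y) / denom if denom else 0.0


# Hypothetical toy data: three features, where features 0 and 1 are assumed
# to belong to the same domain and feature 2 to a different one.
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
domain_sim = [[1.0, 0.8, 0.0],
              [0.8, 1.0, 0.0],
              [0.0, 0.0, 1.0]]

word_a = [1.0, 0.0, 0.0]  # occurs only with feature 0
word_b = [0.0, 1.0, 0.0]  # occurs only with feature 1
word_c = [0.0, 0.0, 1.0]  # occurs only with feature 2

plain = soft_cosine(word_a, word_b, identity)           # ordinary cosine
with_domain = soft_cosine(word_a, word_b, domain_sim)   # same-domain features
cross_domain = soft_cosine(word_a, word_c, domain_sim)  # different domains
```

With the identity matrix the measure reduces to ordinary cosine, so `word_a` and `word_b` share nothing. With `domain_sim` they become similar because their features are related, while `word_c` stays dissimilar. This is one way a clustering step could be biased toward single-domain clusters.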
Similar resources
Paper has been my ruin: conceptual relations of polysemous senses
Polysemous words have different but related meanings (senses), such as paper meaning a newspaper or writing material. Six experiments examined the similarity of word senses using categorization and inference tasks. The experiments found that subjects did not categorize together phrases that used a polysemous word in different senses, though they did when the word was used in the same sense. Dif...
Learning Similarity-based Word Sense Disambiguation from Sparse Data
We describe a method for automatic word sense disambiguation using a text corpus and a machine-readable dictionary (MRD). The method is based on word similarity and context similarity measures. Words are considered similar if they appear in similar contexts; contexts are similar if they contain similar words. The circularity of this definition is resolved by an iterative, converging process, in...
Similarity-based Word Sense Disambiguation
We describe a method for automatic word sense disambiguation using a text corpus and a machine-readable dictionary (MRD). The method is based on word similarity and context similarity measures. Words are considered similar if they appear in similar contexts; contexts are similar if they contain similar words. The circularity of this definition is resolved by an iterative, converging process, in ...
Making Sense of Word Sense Variation
We present a pilot study of word-sense annotation using multiple annotators, relatively polysemous words, and a heterogenous corpus. Annotators selected senses for words in context, using an annotation interface that presented WordNet senses. Interannotator agreement (IA) results show that annotators agree well or not, depending primarily on the individual words and their general usage properti...
Polysemy in Sentence Comprehension: Effects of Meaning Dominance.
Words like church are polysemous, having two related senses (a building and an organization). Three experiments investigated how polysemous senses are represented and processed during sentence comprehension. On one view, readers retrieve an underspecified, core meaning, which is later specified more fully with contextual information. On another view, readers retrieve one or more specific senses...
Publication date: 2007